Improving Speech Enhancement Performance by Leveraging Contextual Broad Phonetic Class Information

Abstract

Previous studies have confirmed that by augmenting acoustic features with the place/manner of articulatory features, the speech enhancement (SE) process can be guided to consider the broad phonetic properties of the input speech when performing enhancement to attain performance improvements. In this paper, we explore the contextual information of articulatory attributes as additional information to further benefit SE. More specifically, we propose to improve SE performance by leveraging losses from an end-to-end automatic speech recognition (E2E-ASR) model that predicts the sequence of broad phonetic classes (BPCs). We also developed multi-objective training with ASR and perceptual losses to train the SE system based on a BPC-based E2E-ASR. Experimental results from speech denoising, speech dereverberation, and impaired speech enhancement tasks confirmed that contextual BPC information improves SE performance. Moreover, the SE model trained with the BPC-based E2E-ASR outperforms that trained with the phoneme-based E2E-ASR. The results suggest that objectives with misclassification of phonemes by the ASR system may lead to imperfect feedback, and BPC could potentially be a better choice. Finally, it is noted that combining the most-confusable phonetic targets into the same BPC when calculating the additional objective can effectively improve SE performance.
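
The two ideas in the abstract, collapsing phoneme targets into broad phonetic classes and combining several training objectives, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `PHONEME_TO_BPC` table is a hypothetical manner-of-articulation grouping (the abstract does not list the BPC inventory), and `multi_objective_loss` with weights `alpha` and `beta` assumes a simple weighted sum, which the abstract does not confirm.

```python
# Hypothetical phoneme-to-BPC table grouped by manner of articulation.
# The actual BPC inventory used in the paper is not given in the abstract.
PHONEME_TO_BPC = {
    "p": "stop", "b": "stop", "t": "stop", "d": "stop", "k": "stop", "g": "stop",
    "f": "fricative", "v": "fricative", "s": "fricative", "z": "fricative",
    "m": "nasal", "n": "nasal",
    "iy": "vowel", "ae": "vowel", "aa": "vowel", "uw": "vowel",
}


def to_bpc_sequence(phonemes):
    """Map each phoneme label to its broad phonetic class label."""
    return [PHONEME_TO_BPC[p] for p in phonemes]


def multi_objective_loss(se_loss, asr_loss, perceptual_loss, alpha=0.1, beta=0.1):
    """Assumed weighted sum of enhancement, ASR, and perceptual terms.

    alpha and beta are hypothetical weights, not values from the paper.
    """
    return se_loss + alpha * asr_loss + beta * perceptual_loss


if __name__ == "__main__":
    # "bat" and "pad" differ at the phoneme level but share one BPC sequence,
    # so easily confused phonemes no longer produce conflicting targets.
    print(to_bpc_sequence(["b", "ae", "t"]))
    print(to_bpc_sequence(["p", "ae", "d"]))
    print(multi_objective_loss(1.0, 2.0, 3.0, alpha=0.5, beta=0.5))
```

Under this grouping, a recognizer that confuses "b" with "p" still emits the correct class label, which is one way to read the abstract's remark that phoneme misclassification can give imperfect feedback to the SE model.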

Similar resources

Automatic Syllable Segmentation Using Broad Phonetic Class Information

We propose in this paper a language-independent method for syllable segmentation. The method is based on the Sonority Sequencing Principle, by which the sonority inside a syllable increases from its boundaries towards the syllabic nucleus. The sonority function employed was derived from the posterior probabilities of a broad phonetic class recognizer, trained with data coming from an open-sourc...

Language-independent Automatic Syllable Segmentation Using Broad Phonetic Class Information

On Improving Face Detection Performance by Modelling Contextual Information

In this paper we present a new method to enhance object detection by removing false alarms and merging multiple detections in a principled way with few parameters. The method models the output of an object classifier which we consider as the context. A hierarchical model is built using the detection distribution around a target sub-window to discriminate between false alarms and true detections...

Bio-inspired Broad-class Phonetic Labelling

Recent studies have shown that the correct labeling of phonetic classes may help current Automatic Speech Recognition (ASR) when combined with classical parsing automata based on Hidden Markov Models (HMM). Through the present paper a method for Phonetic Class Labeling (PCL) based on bio-inspired speech processing is described. The methodology is based in the automatic detection of formants and...

The use of broad phonetic class models in speaker recognition

In this paper we investigate the use of broad phonetic class (BPC) models in a text independent speaker recognition task. These models can be used to bring down the variability due to the intrinsic differences between mutual phonetic classes in the speech material used for training of the speaker models. Combining BPC recognition with text independent speaker recognition moves a bit in the dire...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2023

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2023.3288418